1 Introduction

In recent years, the study of static and dynamical properties of complex
networks has received a lot of attention [1-5]. Complex networks appear in such
diverse disciplines as sociology, biology, chemistry, physics and computer
science. In particular, great effort has been devoted to understanding the
behavior of technologically based communication networks such as the Internet
[6], the World Wide Web [7], or e-mail networks [8-10]. However, the study of
communication processes in a wider sense is also of interest in other fields,
notably the design of organizations [11,12]. For instance, it is estimated that
more than half of the U.S. work force is dedicated to information processing,
rather than to making or selling things in the narrow sense [11].

The pioneering work of Watts and Strogatz [1] opened a completely new field
of research. Its main contribution was to show that many real-world networks
share properties of both random graphs and regular low-dimensional lattices.
A model that could explain this observed behavior was missing, and the
"small-world" model proposed by the authors turned the interest of a large
number of scientists in the statistical mechanics community toward this
appealing subject. Nevertheless, this simplified model gives rise to a
connectivity distribution function with an exponential form, whereas many
real-world networks show a highly skewed degree distribution, usually with a
power-law tail

$P(k) \propto k^{-\gamma}$    (1)

with an exponent $2 < \gamma < 3$. Barabási and Albert [2] proposed a model
where nodes and links are added to the network in such a way that the
probability that an added node links to an old node depends on the number of
existing connections of the old node. This simple computational model can
explain the power law with an exponent $\gamma = 3$.
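The growth rule just described can be illustrated with a minimal sketch. It is not the implementation used in [2]: the function names, the fully connected seed of m + 1 nodes, and the rejection of repeated targets are assumptions made here only to obtain a runnable example.

```python
import random

def barabasi_albert_edges(n_nodes, m, seed=None):
    """Grow a network by preferential attachment (illustrative sketch)."""
    rng = random.Random(seed)
    # Assumption: start from a fully connected core of m + 1 nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # 'stubs' lists every link endpoint, so a node of degree k appears k times;
    # a uniform draw from it picks a node with probability proportional to k.
    stubs = [node for edge in edges for node in edge]
    for new in range(m + 1, n_nodes):
        targets = set()
        while len(targets) < m:          # m distinct old nodes
            targets.add(rng.choice(stubs))
        for old in sorted(targets):
            edges.append((new, old))
            stubs.extend((new, old))
    return edges

def degrees(edges, n_nodes):
    """Number of links attached to each node."""
    deg = [0] * n_nodes
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg
```

Calling `degrees(barabasi_albert_edges(10000, 2, seed=1), 10000)` and building a histogram of the result is one way to inspect the skewed degree distribution discussed above.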

Tools taken from statistical mechanics have been used to understand not
only the topological properties of these communication networks, but also their
dynamical properties. The main focus has been on the problem of searchability,
although when the number of search problems that the network is trying to
solve increases, the problem of congestion at some central nodes arises. It has
been observed, both in real-world networks [13] and in model communication
networks [14-18], that the networks collapse when the load is above a certain
threshold, and the observed transition can be related to the appearance of the
$1/f$ spectrum of the fluctuations in Internet flow data [19,20].

These two problems, search and congestion, which have so far been analyzed
separately in the literature, can be incorporated in the same communication
model. In previous works [16,21,18,22] we have introduced a collection of models
that capture the essential features of communication processes and are able to
handle these two important issues simultaneously. In these models, agents are
nodes of a network and can exchange information packets along the network
links. Each agent has a certain capability that decreases as the number of packets
to deliver increases. The transition from a free phase to a congested phase has
been studied for different network architectures in [16,18], whereas in [21] the
cost of maintaining communication channels was considered. Finally, in [22] we
addressed the problem of network optimization for a fixed number of links and
nodes.

This paper is organized as follows. In Sect. 2 we present well-known results
about search in complex networks, whereas in Sect. 3 we review recent work on
network load, considered as a betweenness centrality and hence a static
characterization of the network. We present the common features of our
communication model in Sect. 4. In the next section, we show some of the exact
results that have been obtained for a particular class of networks, Cayley
trees. Finally, in the last two sections we focus on the problem of network
optimization: in the first one through a parameterized set of networks,
including connectivities that can be short- or long-ranged and different
degrees of preferentiality, and in the second one through an exhaustive search
of optimal networks for a fixed number of nodes and links.

2 Search in complex networks 

After the discovery of complex networks, one of the issues that has attracted
a lot of attention is "search". Real complex communication networks such as
the Internet or the World Wide Web are continuously changing, and it is not
possible to draw a map that allows one to navigate them. Rather, it is necessary
to develop algorithms that efficiently search for the desired computers or the
desired contents.

The study of this problem originated in sociology, with the seminal
experiment of Travers and Milgram [23]. Surprisingly, it was found that the
average length of acquaintance chains was about six. This means not only that
short chains exist in social networks, as reported, for example, in the "small
world" paper by Watts and Strogatz [1], but, even more strikingly, that these
short chains can be found using local strategies, that is, without knowing
exactly the whole structure of the social network.

The first attempt to understand theoretically the problem of searchability 
in complex networks was provided by Kleinberg [24]. In his work, Kleinberg 




Fig. 1. Network topology and search in Kleinberg's scenario. Consider nodes A and
B. The distance between them is $\Delta_{AB} = 6$ although the shortest path is
only 3. A search process to get from A to B would proceed as follows. From A, we
would jump with equal probability to D or F, since $\Delta_{DB} = \Delta_{FB} = 5$:
suppose we choose F. The next jump would then be to G or C with equal probability
since $\Delta_{CB} = \Delta_{GB} = 4$, although from C it is possible to jump
directly to B. This is a consequence of the local knowledge of the network
assumed by Kleinberg.



proposes a scenario where the network is modeled as a combination of a
two-dimensional regular lattice plus a number of long-range links. The distance
$\Delta_{ij}$ between two nodes i and j is defined as the number of "lattice
steps" separating them in the regular lattice, that is, disregarding long-range
links (see Fig. 1). Long-range links are not established at random. Instead,
when a node i establishes one such link, it connects with higher probability to
those nodes that are closer in terms of the distance $\Delta$. In particular,
the probability that the link is established with node j is

$\Pi_{ij} \propto (\Delta_{ij})^{-r}$    (2)

where r is a parameter.
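As an illustration of this linking rule, the following sketch normalizes weights proportional to the lattice distance raised to the power -r and draws one long-range target. The square lattice with open boundaries, the side length `side`, and all function names are assumptions introduced here for the example only.

```python
import random

def lattice_distance(a, b):
    """Number of lattice steps between sites a = (x, y) and b = (x, y)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def long_range_probabilities(i, side, r):
    """Normalized probabilities proportional to distance**(-r) over all j != i."""
    sites = [(x, y) for x in range(side) for y in range(side) if (x, y) != i]
    weights = [lattice_distance(i, j) ** (-r) for j in sites]
    total = sum(weights)
    return {j: w / total for j, w in zip(sites, weights)}

def sample_long_range_target(i, side, r, rng=random):
    """Draw the endpoint of one long-range link leaving node i."""
    probs = long_range_probabilities(i, side, r)
    sites = list(probs)
    return rng.choices(sites, weights=[probs[j] for j in sites], k=1)[0]
```

With r = 0 every other node is equally likely, whereas larger values of r concentrate the long-range links on nearby nodes.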

The search algorithm proposed by Kleinberg is the following. A packet standing
at one node will be sent to the neighbor of the node that is closest to the
destination in terms of the distance $\Delta$. The algorithm is local because,
as shown in Fig. 1, the heuristic of minimizing $\Delta$ does not guarantee that
the packet will follow the shortest path between its current position and its
destination. Therefore, the underlying two-dimensional lattice has an imprecise
global informational content.
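The forwarding rule can be sketched in a few lines. The coordinates below are hypothetical: they are not taken from Fig. 1, but were chosen here so that the distances quoted in its caption (6 from A, 5 from D and F, 4 from G and C) are reproduced.

```python
def lattice_distance(a, b):
    """Number of lattice steps between two sites, ignoring long-range links."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def greedy_candidates(neighbors, current, destination):
    """Neighbors of `current` with the smallest lattice distance to `destination`."""
    best = min(lattice_distance(n, destination) for n in neighbors[current])
    return [n for n in neighbors[current]
            if lattice_distance(n, destination) == best]

# Hypothetical coordinates consistent with the distances quoted for Fig. 1.
A, B, C, D, F, G = (0, 0), (3, 3), (0, 2), (1, 0), (0, 1), (1, 1)
neighbors = {
    A: [D, F],
    F: [A, G, C],
    C: [F, B],      # C holds a long-range link to B
}
```

Here `greedy_candidates(neighbors, A, B)` returns D and F, and from F the candidates are G and C, even though only C leads to B in one further step. Choosing among tied candidates at random completes the rule.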

Kleinberg showed that with this essentially local scenario (with imprecise
global information), short paths cannot be found in general, unless the parameter
r is fixed to r = 2. This raised the question of why real networks are then
searchable, that is, how it is possible that in real networks local strategies
are able to find paths that scale as log N, where N is the size of the network.
Recently, Watts and coworkers have shown that with an idea similar to
Kleinberg's, one can easily obtain searchable networks [25]. Their contribution
consists in substituting the underlying low-dimensional lattice by an
ultra-metric space where individuals are organized in a hierarchical fashion
according to their preferences, similarities, etc. In this case, a broad
collection of networks turn out to be searchable.

Parallel to these efforts, there have been some attempts to exploit the
scale-free nature of some networks to design algorithms that, being local in
nature, are still quite efficient [26,27]. The idea in all these works is to
profit from the scale-free nature of networks such as the Internet and bias the
search towards those nodes that have a high connectivity and therefore act as
hubs.

3 Load and congestion in complex networks 

When the network has to tackle several simultaneous (or parallel) search
problems, the important issue of congestion at overburdened nodes arises [13-17].
Indeed, for a single search problem the optimal network is clearly a highly
centralized star-like structure, with one or several nodes in the center and all
the rest connected to them. This structure is cheap to assemble in terms of
number of links and efficient in terms of searchability, since the average cost
(number of steps) to find a given node is always bounded (2 steps),
independently of the size of the system. However, the star-like structure
becomes inefficient when many search processes coexist in parallel in the
network, due to the limited ability of the central node to process all the
information.

Load, independently of search, has been analyzed in different classes of
networks [28-31]. The load, as introduced in these works, is equivalent to the
betweenness as it has been defined in social networks [32,28]. The betweenness
of a node j, $\beta_j$, is defined as the number of minimum paths connecting
pairs of nodes in the network that go through node j. Among the topological
properties of networks, betweenness has become one of their main
characteristics. In principle the time needed for the computation of the
betweenness of all vertices is of order $O(M N^2)$, where N is the number of
nodes and M the number of links of the network. However, Newman [28] introduced
an algorithm that reduces the time needed for the computation by a factor of N.
This definition was used to measure the social role played by scientists in some
collaboration networks [28]. Later on, it was also applied to characterize model
networks. Thus, in [29] different networks are constructed and their
distribution of betweennesses (or loads) measured. For instance, scale-free
networks with an exponent $2 < \gamma < 3$ lead to a load distribution which is
also a power law, $P(\ell) \sim \ell^{-\delta}$ with $\delta \approx 2.2$. On
the other hand, the load distribution of small-world networks shows a combined
behavior of two Poisson-type decays. In subsequent work, the authors of [31]
suggested that real-world networks should be classified in two different
universality classes, according to the exponent of the power-law distribution
of loads. Finally, the distribution of loads was analytically computed for
scale-free trees in [30].

The works discussed in the previous paragraph consider the betweenness
as a topological property of the network, since it accounts for the number of
shortest paths going through a node. However, to take into account the search
algorithm and the fact that packets can perform several random steps and then
go through the same node more than once, we introduce an effective betweenness.
The effective betweenness of node j, $B_j$, represents the total number of
packets that would pass through j if one packet were generated at each node at
each time step with destination to any other node. The effective betweenness
coincides with the topological betweenness when the nodes have complete
information of the network structure and packets always follow the shortest
paths between origin and destination.
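For concreteness, the topological betweenness can be computed with the following sketch, which uses one breadth-first search per source node followed by a backward accumulation, so that the cost is of order MN for an unweighted network. Two conventions are assumed here and are not taken from the works cited above: when several shortest paths connect a pair, each contributes fractionally, and the end points of a path are not counted.

```python
from collections import deque

def betweenness(adj):
    """Shortest-path betweenness of every node of an unweighted, undirected graph.

    adj maps each node to the list of its neighbors.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path fractions from the farthest nodes back toward s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair of end points was counted from both ends.
    return {v: b / 2.0 for v, b in bc.items()}
```

For a star with four leaves, the central node lies on the single shortest path of each of the six leaf pairs, while every leaf has betweenness zero.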

4 A model of communication 

The model that can handle search and congestion at the same time considers 
that the information is formed by discrete packets that are sent from an origin 
node to a destination node. Each node can store as many information packets 
as needed. However, the capacity of nodes to deliver information cannot be 
infinite. In other words, any realistic model of communication must consider that 
delivering, for instance, two information packets takes more time than delivering 
just one packet. A particular example of this would be to assume that nodes 
are able to deliver one (or any constant number) information packet per time 
step independently of their load, as happens in the communication model by 
Radner [11] and in simple models of computer queues [14,15,17], but note that
many alternative situations are possible. In the present model, each node has a 
certain capability that decreases as the load of accumulated packets increases. 
This limitation in the capability of agents to deliver information can result in 
congestion of the network. Indeed, when the amount of information is too large, 
agents are not able to handle all the packets and some of them remain undelivered 
for extremely long periods of time. The maximum amount of information that a 
network can manage gives a measure of the quality of its organizational structure. 
In the study of the model, the interest is focused on both when the congestion
occurs and how it occurs.

4.1 Description of the model 

The dynamics of the model is as follows. At each time step t, an information
packet is created at every node with probability p. Therefore p is the control
parameter: small values of p correspond to a low density of packets and high
values of p correspond to a high density of packets. When a new packet is
created, a destination node, different from the origin, is chosen randomly in
the network. Thus, during the following time steps t + 1, t + 2, ..., t + T,
the packet travels toward its destination. Once the packet reaches the
destination node, it is delivered and disappears from the network. Another
interpretation is possible for this information transfer scenario. Packets can
be regarded as problems that arise at a certain rate anywhere in an
organization. When one such problem arises, it must be solved by an arbitrary
agent of the network. Thus, in subsequent time steps the problem flows toward
its solution until it is actually solved. This problem solving scenario can be
considered a particularly illustrative case of the more general information
transfer scenario. The problem solving interpretation suggests a model similar
to Garicano's [33] in that there is task diversity and agents are specialized
in solving only certain types of tasks.

The time that a packet remains in the network is related not only to the
distance between the source and the target nodes, but also to the amount of
packets in its path. Indeed, nodes with high loads (i.e. high quantities of
accumulated packets) will need long times to deliver the packets or, in other
words, it will take long times for packets to cross regions of the network that
are highly congested. In particular, at each time step, all the packets move
from their current position, i, to the next node in their path, j, with a
probability $q_{ij}$. This probability is called the quality of the channel
between i and j, and is defined as

$q_{ij} = \sqrt{k_i k_j}$ ,    (3)

where $k_i$ represents the capability of agent i and, in general, changes with
time. The quality of a channel is, thus, the geometric average of the
capabilities of the two nodes involved, so that when one of the agents has
capability 0, the channel is disabled. It is assumed that $k_i$ depends only on
the number of packets at node i, $\nu_i$, through:

$k_i = f(\nu_i)$    (4)

The function f(n) determines how the capability evolves when the number of
packets at a given node changes. In [18] we proposed a general form, although in
this paper we will only show results for the case in which the number of
delivered packets is constant. This particular case is consistent with simple
models of computer queues [14], although the precise definition of the models
may differ from ours.

The choice of the functional form for the quality of the channels and the
capability of the nodes is arbitrary. Regarding the first, (3) is plausible for
situations in which an effort is needed from both agents involved in the
communication process. If, on the contrary, information can be transmitted
without the collaboration of the receiver, an equation of the form

$q_{ij} = k_i$ ,    (5)

would be more appropriate. Equation (5) will be used for analytical understanding
of the problem in Sect. 7, whereas (3) is used in Sect. 5. Some of the most
relevant features of the model, however, are not dependent on which one is used.
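The dynamics described in this section can be condensed into a short simulation sketch. Several ingredients are assumptions made here for illustration and are not specified by the text: the capability is taken as f(n) = 1/n for n >= 1 and f(0) = 1, packets follow shortest paths obtained by breadth-first search, the network is connected, and all packets are updated synchronously using the loads at the start of the movement phase. The channel quality is the geometric average of the two capabilities.

```python
import math
import random
from collections import deque

def capability(n):
    """Assumed form of f(n): 1 for an empty node, 1/n otherwise."""
    return 1.0 if n == 0 else 1.0 / n

def next_hop_table(adj):
    """hop[dest][node] = neighbor of node on a shortest path toward dest."""
    hop = {}
    for dest in adj:
        parent = {dest: dest}
        queue = deque([dest])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in parent:
                    parent[w] = v
                    queue.append(w)
        hop[dest] = parent
    return hop

def simulate(adj, p, steps, seed=None):
    """Return the total number of packets in the network after each time step."""
    rng = random.Random(seed)
    nodes = list(adj)
    hop = next_hop_table(adj)
    queues = {v: [] for v in nodes}      # destinations of the packets held at v
    history = []
    for _ in range(steps):
        # Creation: each node generates a packet with probability p.
        for v in nodes:
            if rng.random() < p:
                queues[v].append(rng.choice([u for u in nodes if u != v]))
        # Movement: each packet crosses its channel with probability q_ij.
        k = {v: capability(len(queues[v])) for v in nodes}
        arrivals = {v: [] for v in nodes}
        for v in nodes:
            staying = []
            for dest in queues[v]:
                j = hop[dest][v]
                if rng.random() < math.sqrt(k[v] * k[j]):
                    if j != dest:        # otherwise the packet is delivered
                        arrivals[j].append(dest)
                else:
                    staying.append(dest)
            queues[v] = staying
        for v in nodes:
            queues[v].extend(arrivals[v])
        history.append(sum(len(q) for q in queues.values()))
    return history
```

Running `simulate` for increasing values of p on a fixed network gives time series of the total load of the kind discussed in the next subsection.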

4.2 Congestion and network capacity 

Depending on the rate of generation of packets p, two different behaviors are
observed. When the amount of packets is small, the network is able to deliver
all the packets that are generated and, after a transient, the total load N of
the network achieves a stationary state and fluctuates around a constant value.
These fluctuations are indeed quite small. Conversely, when p is large enough
the number of generated packets is larger than the number of packets that the
network can manage to solve and the network enters a state of congestion.
Therefore, N
Fig. 2. Evolution of the total number of packets, N, as a function of time for a (5,7)
Cayley tree and different values of p, below the critical congestion point
($p = 1.1 \times 10^{-4} < p_c$), above the critical congestion point
($p = 1.5 \times 10^{-4} > p_c$), and close to the critical congestion point
($p = 1.3 \times 10^{-4} \approx p_c$). Note the logarithmic scale in the Y axis.



never reaches the stationary state but grows indefinitely in time. The transition
from the free regime, p small, to the congested regime, p large, occurs for a
well-defined value of p, that will be denoted $p_c$. For values smaller than but
close to $p_c$, the steady state is reached but large fluctuations arise.

The three behaviors (free, congested and close to the transition) are depicted
in Fig. 2. For $p < p_c$, the width of the fluctuations is small, indicating
short characteristic times. This means, among other things, that the average
time required to deliver a packet to the destination is small. It also means
that correlation times are short, that is, the state of the network at one time
step has little influence on the state of the network only a few time steps
later. As p approaches $p_c$, the fluctuations are wider and one can conclude
that correlations become important. In other words, as one approaches $p_c$ the
time needed to deliver a packet grows and the state of the network at one
instant is determinant for its state many time steps later. In the congested
regime, the amount of delivered packets is independent of the load and thus
remains constant over time, while the number of generated packets is also
constant, but larger than the amount of delivered packets. Thus, at each time
step the number of accumulated packets is increased by a constant amount, and
N(t) grows linearly in time.

The transition from the free regime to the congested regime is therefore
captured by the slope of N(t) in the stationary state. When all the packets are
delivered and there is no accumulation, the average slope is 0, while it is
larger than 0 for $p > p_c$. We use this property to introduce an order
parameter, $\eta$, that
Fig. 3. Typical hierarchical tree structure used for simulations and calculations: in
particular, a (3,4) tree. Dashed line: definition of branch, as used in some of the
calculations.

is able to characterize the transition from one regime to the other:

$\eta(p) = \lim_{t \to \infty} \frac{1}{pS} \frac{\langle \Delta N \rangle}{\Delta t}$    (6)

In this equation $\Delta N = N(t + \Delta t) - N(t)$, $\langle \ldots \rangle$
indicates an average over time windows of width $\Delta t$ and S is the number
of nodes in the system. Essentially, the order parameter represents the ratio
between undelivered and generated packets calculated at long enough times such
that $\Delta N \propto \Delta t$. Thus, $\eta$ is only a function of the
probability of packet generation per node and time step, p. For $p > p_c$, the
system collapses, $\langle \Delta N \rangle$ grows linearly with $\Delta t$ and
$\eta$ takes a nonzero value. For $p < p_c$, $\langle \Delta N \rangle = 0$ and
$\eta = 0$. Since the order parameter is continuous at $p_c$, the transition to
congestion is a critical phenomenon and $p_c$ is a critical point as usually
defined in statistical mechanics [34].
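In a simulation, the order parameter can be estimated from a recorded series of the total load as the average increment of N over windows of a given width, divided by the number of packets generated in such a window (p times the number of nodes times the window width). The sketch below assumes, for illustration only, that the first half of the series is the transient to be discarded; the function name and arguments are likewise hypothetical.

```python
def order_parameter(history, p, n_nodes, window):
    """Estimate the ratio of undelivered to generated packets from N(t).

    history : total number of packets after each time step
    p       : packet generation probability per node and time step
    n_nodes : number of nodes S
    window  : width of the time windows used for the average
    """
    tail = history[len(history) // 2:]      # assumed transient: first half
    increments = [tail[t + window] - tail[t]
                  for t in range(len(tail) - window)]
    mean_increment = sum(increments) / len(increments)
    return mean_increment / (p * n_nodes * window)
```

In the free regime the estimate fluctuates around zero, and finite-size fluctuations can make it slightly negative; in the congested regime it approaches the fraction of generated packets that are never delivered.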

Once the transition is characterized, the first issue that deserves attention
is the location of the transition point $p_c$ as a function of the parameters of
the network. This transition point gives information about the capacity of a
given network. Indeed, the maximum number of packets that a network can handle
per time step will be $N_c = S p_c$. Therefore, $p_c$ is a measure of the amount
of information an organization is able to handle and thus of the efficiency of a
given organizational structure. One reasonable problem to propose is, therefore,
which is the network that maximizes $p_c$ for a fixed set of available resources
(agents and links).

5 Analytical results for hierarchical lattices 

As a first step we considered hierarchical networks, since they provide a zeroth 
order approximation to real structures, and have also been used in the economics 
literature to model organizations [11,35]. In particular we are going to focus on 
hierarchical Cayley trees, as depicted in Fig. 3. Cayley trees are identified by 
their branching z and their number of levels m, and will be denoted (z,m) 
hereafter. 

In this case the system is regarded as hierarchical also from a knowledge
point of view. It is assumed in the model that agents have complete knowledge
of the structure of the network in the subbranch they root. Therefore, when an
agent receives a packet, he or she can evaluate whether the destination is to be
found somewhere below. If so, the packet is sent in the right direction; otherwise,
the agent sends the packet to his or her supervisor. Using this simple routing
algorithm, the packets always travel following the shortest path between their
origin and their destination.

As happens in other problems in statistical physics [36], the particular
symmetry of the hierarchical tree allows an analytical estimation of the
critical point $p_c$. In particular, the approach taken here is mean field in
the sense that fluctuations are disregarded and only average expected values are
considered. By using the steady state condition that the number of packets
arriving at the top node, which is the most congested one, equals the number of
packets leaving it, we arrive at the following inequality

$p_c \le \frac{\sqrt{z}}{\frac{z (z^{m-1} - 1)^2}{z^m - 1} + 1}$    (7)

when the quality of the channels is given by (3). Although this expression
provides an upper bound to $p_c$, (7) is an excellent approximation for z > 3,
as shown in Fig. 4.

The total critical number of generated packets, $N_c = p_c S$, with S denoting
the size of the system, can be approximated, for large enough values of z and m
such that $z^{m-1} \gg 1$, by

$N_c = \frac{z^{3/2}}{z - 1}$ ,    (8)

which is independent of the number of levels in the tree. It suggests that the
behavior of the top node is only affected by the total number of packets arriving
from each node of the second level, which is consistent with the mean field
hypothesis.
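The relation between the bound on the critical probability and its large-tree limit can be checked numerically. The sketch below assumes the bound in the form $p_c \le \sqrt{z} / [z (z^{m-1} - 1)^2 / (z^m - 1) + 1]$ and the tree size $S = (z^m - 1)/(z - 1)$, that is, one top node plus z branches at each of the remaining m - 1 levels; the function names are illustrative.

```python
import math

def critical_probability_bound(z, m):
    """Upper bound on p_c for a (z, m) tree with quality sqrt(k_i * k_j)."""
    return math.sqrt(z) / (z * (z ** (m - 1) - 1) ** 2 / (z ** m - 1) + 1)

def tree_size(z, m):
    """Number of nodes of a (z, m) tree: 1 + z + ... + z**(m - 1)."""
    return (z ** m - 1) // (z - 1)

def critical_load_limit(z):
    """Large-tree limit of N_c = p_c * S, independent of m."""
    return z ** 1.5 / (z - 1)

# Example: the (5,7) tree used in Fig. 2.
bound = critical_probability_bound(5, 7)
ratio = bound * tree_size(5, 7) / critical_load_limit(5)
```

For the (5,7) tree the bound lies between the two values of p quoted in the caption of Fig. 2 as below and above the transition, and `ratio` differs from 1 by less than one part in a thousand.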

According to (8), the total number of packets a network can deal with, $N_c$,
is a monotonically increasing function of z for $z \ge 3$, suggesting that, given
the number of agents in the organization, S, the optimal organizational
structure, understood as the structure with highest capacity to handle
information, is the flattest one, with m = 2 and z = S - 1.

To understand this result it is necessary to take into account the following 
considerations: 

• We are restricting our comparison only to different hierarchical networks
and in any hierarchical network, the top node will receive most of the
packets. Since origins and destinations are generated with uniform independent
probabilities, roughly (z - 1)/z of the packets will pass through the top node.

• Still, it could seem that having small z is slightly better according to the
previous consideration. However, it is important to note that, in the present
model (in particular due to (3)), the loads of both the sender and the receiver
are important to have a good communication quality. In a network with small
z, the nodes in the second level also have a high load, while in a network
with a high z the nodes in the second level are much less loaded.
Fig. 4. Comparison between analytical (lines) and numerical (symbols) values obtained
for hierarchical trees. Left: scaled critical probability (7). Right: order parameter (9).



• We have implicitly assumed that there is no cost for an agent to have a large 
amount of communication channels active. 

For the order parameter, it is possible to derive an analytical expression for
the simplest case where there are only two nodes that exchange packets. Since
from symmetry considerations $\nu_1 = \nu_2$, the average number of packets
eliminated in one time step is 2, while the number of generated packets is 2p.
Thus $p_c = 1$ and with the present formulation of the model it is not possible
to reach the super-critical congested regime. However, p can be extended to be
the average number of generated packets per node at each step (instead of a
probability) and in this case it can actually be as large as needed. As a
result, the order parameter for the super-critical phase is
$\eta = (p - 1)/p$. As observed in Fig. 4, the general form

$\eta(p/p_c) = \frac{p/p_c - 1}{p/p_c}$    (9)

fits very accurately the behavior of the order parameter for any Cayley tree.



6 Optimization in model networks 

In this section we extend previous studies about local search in model networks 
in two directions. First, we consider networks that, as in Kleinberg's work, are 
embedded in a two-dimensional space, but study the effect not only of long 
range random links but also of long range preferential links. Secondly and more 
significantly, we consider the effect of congestion when multiple searches are 
carried out simultaneously. As we will show, this effect has drastic consequences 
for optimal network design. 



6.1 Network topology 

The small-world model [1] considered two main components: local linking with
neighbors and random long-range links giving rise to short average distance
between nodes. The idea of Kleinberg is that local linking provides information
about the social structure and can be exploited to heuristically direct the
search process. Later, Barabási and Albert showed that growth and preferential
attachment play a fundamental role in the formation of many real networks [2].
Even though this model captures the correct mechanism for the emergence of
highly connected nodes, it is not likely that it captures all mechanisms
responsible for the evolution of "real-world" scale-free networks. In
particular, it seems plausible that in many of the networks that show
scale-free behavior there is also an underlying structure as in the Watts and
Strogatz model. To illustrate this idea, consider web pages in the World Wide
Web. It is plausible to assume that a page devoted to physics is more likely to
be connected to another page devoted to physics than to a page devoted to
sociology. That is, a set of pages devoted to physics is likely to be more
inter-connected than a set including pages devoted to physics and sociology.

Therefore we consider networks with four basic components: growth, pref- 
erential attachment, local attachment and random attachment. To create the 
network the following algorithm is used: 

1. Nodes are located in a two-dimensional square lattice without interconnect- 
ing them. 

2. A node i is chosen at random. 

3. We create m links starting at the selected node. With probability $\phi$, the
destination node is selected preferentially. With probability $1 - \phi$, the
destination node is one of the nearest neighbors of the selected node. When
the destination node is selected preferentially, we apply the following rule:
the probability that a given destination node j is chosen is a function of its
connectivity

$\Pi_j \propto k_j^{\gamma}$ ,    (10)

where $k_j$ is the number of links of node j and $\gamma$ is a parameter that
allows one to tune the network from maximum preferentiality to no
preferentiality. Indeed, for $\gamma = 0$ the links are random and for
$\gamma = 1$ we recover the BA model, which generates scale-free networks in
the case $\phi = 1$. For $\gamma > 1$, a few nodes tend to accumulate all the
links.

4. A new node is chosen and the process is repeated from step 3, until all the 
nodes have been chosen once. 

Figure 5 shows two examples of networks in the process of being created accord- 
ing to this algorithm. Note that in this case, the number of links is fixed and the 
existence of long range links implies that some local links are not present and 
therefore that the information contained in the two-dimensional lattice is less 
precise. 
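The construction algorithm above can be sketched as follows. The text leaves some details open, so the following choices are assumptions made here only to obtain a runnable example: the lattice has open boundaries, self-links and repeated links are not allowed, a node whose lattice neighbors are all already linked falls back to the preferential rule, and when every candidate has zero weight (isolated nodes with a positive exponent) the target is drawn uniformly. The exponent `gamma` is assumed to be non-negative.

```python
import random

def build_network(side, m, phi, gamma, seed=None):
    """Return {node: set of linked nodes} on a side x side lattice (sketch)."""
    rng = random.Random(seed)
    nodes = [(x, y) for x in range(side) for y in range(side)]
    links = {v: set() for v in nodes}

    def lattice_neighbors(v):
        x, y = v
        candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
        return [u for u in candidates if u in links]

    def preferential_target(i):
        cand = [j for j in nodes if j != i and j not in links[i]]
        if not cand:
            return None
        weights = [len(links[j]) ** gamma for j in cand]   # k_j ** gamma
        if sum(weights) == 0:        # assumption: uniform choice in this case
            return rng.choice(cand)
        return rng.choices(cand, weights=weights, k=1)[0]

    order = nodes[:]
    rng.shuffle(order)               # every node is selected exactly once
    for i in order:
        for _ in range(m):
            local = [j for j in lattice_neighbors(i) if j not in links[i]]
            if rng.random() < phi or not local:
                j = preferential_target(i)
                if j is None:
                    break
            else:
                j = rng.choice(local)
            links[i].add(j)
            links[j].add(i)
    return links
```

For instance, `build_network(20, 4, 0.25, 1.0, seed=0)` combines mostly local links with a minority of preferential long-range links, in the spirit of the right panel of Fig. 5.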

6.2 Communication model and search algorithm 

After the definition of the network creation algorithm, we move to the
specification of the communication model and the search algorithm. For the
communication model, we will use the general model presented and discussed in
Sect.

Fig. 5. Construction of networks with multiple linking mechanisms. In both cases
$\phi = 0.25$. A random node is selected at each time step and m = 4 new links
starting from that node are created. Black nodes represent nodes that have
already been selected. Dotted lines represent the links created during the last
time step in which node C was selected. In (a), the destination of long-range
links is chosen at random ($\gamma = 0$), while in (b) they are chosen
preferentially ($\gamma > 0$) and nodes A and B are attracting most of them.



4. As already stated, this model is general enough and considers the effect of
congestion due to the limited ability of nodes to handle information.

In comparison with hierarchical networks, there is only one ingredient of the
communication model that needs to be reformulated. In the hierarchical version
of the model, when a node receives a packet, it decides to send it downwards in
the right direction if the solution is there, or upward to the agent overseeing
her otherwise. This simple routing algorithm arises from the fact that we
implicitly assume that the hierarchy is not only a communicational hierarchy,
but also a knowledge hierarchy, where nodes know perfectly the structure of the
network below them. In a complex network, this informational content of the
hierarchy is lost. Here we will use Kleinberg's approach [24]. When an agent
receives a packet, she knows the coordinates in the underlying two-dimensional
space of its destination. Therefore, she forwards the packet to the neighbor
that is closest to the destination according to the lattice distance $\Delta$
defined in Sect. 2, provided that the packet has not visited that node
previously (see footnote 1). Note, however, that distance refers to the
two-dimensional space, but not necessarily to the topology of the complex
network and, as in Kleinberg's work, there might be shortcuts in directions that
increase $\Delta$. Moreover, here long-range links replace short-range links and
are not simply added to short-range links. Therefore it is possible that,
following the direction of minimization of $\Delta$, the packet arrives at a
dead end and has to go back.
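The following sketch illustrates one possible reading of this rule for a single packet, ignoring congestion. The packet remembers the nodes it has visited, moves to the unvisited neighbor closest to the destination in lattice distance, and retraces its last step when no unvisited neighbor remains. The backtracking order and the deterministic tie-breaking are assumptions made here, and the small network in the example is hypothetical.

```python
def lattice_distance(a, b):
    """Number of lattice steps between two sites, ignoring long-range links."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def greedy_route(links, source, destination):
    """Positions occupied by the packet, or None if it cannot be delivered."""
    visited = {source}
    trail = [source]        # current path, used to step back out of dead ends
    walk = [source]
    while trail:
        current = trail[-1]
        if current == destination:
            return walk
        fresh = [n for n in links[current] if n not in visited]
        if fresh:
            nxt = min(fresh, key=lambda n: lattice_distance(n, destination))
            visited.add(nxt)
            trail.append(nxt)
        else:
            trail.pop()                 # dead end: return to the previous node
            if not trail:
                return None
            nxt = trail[-1]
        walk.append(nxt)
    return None

# Hypothetical example: the local link from (1, 0) toward the destination has
# been replaced, so the greedy choice leads into a dead end first.
links = {
    (0, 0): [(1, 0), (0, 1)],
    (1, 0): [(0, 0)],
    (0, 1): [(0, 0), (3, 0)],           # long-range link to the destination
    (3, 0): [(0, 1)],
}
route = greedy_route(links, (0, 0), (3, 0))
```

In this example the packet first moves to (1, 0), which looks closer to the destination, returns to the origin, and only then reaches the destination through (0, 1).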

Considering this algorithm, it is interesting that the three mechanisms to
establish links (local, random and preferential) are somehow complementary. A
completely regular lattice (all links are local) contains a lot of information
since all the agents efficiently send their packets in the best possible
direction. However, the average path length is extremely high in these networks
and therefore the number of packets that are flowing in the network at a given
time is also very high. The addition of random links can reduce dramatically the
average path length, as in small-world networks. However, if the number of
random links is very high, then the number of local links is small and thus
sending the packet to the node closest to the destination is probably quite
inefficient (since it is possible that, even if it is very close in the
underlying two-dimensional space, there is no short path in the actual topology
of the network). Finally, preferential links seem to solve both problems. They
obviously solve the long average path length problem but, in addition, the loss
of information is not large, since it is the highly connected nodes that
actually concentrate this information. The star configuration is an extreme
example of this: although there are no local links, the central node is capable
of sending all the packets in the right directions. However, when the amount of
information to handle is big, preferential links are especially inadequate
because highly connected nodes act as centers of congestion. Therefore, optimal
structures should be networks where all the mechanisms coexist: complex
networks.

1 Packets are sent to previously visited nodes only if it is strictly necessary.
This memory restriction prevents packets from getting trapped in loops.



6.3 Results 

We simulate the behavior of the communication model in networks built according 
to the algorithm presented in Sect. 4.1. First, a value of the probability of 
packet generation per node and time step, p, is fixed. For that particular value, 
we compare the performance of different networks: networks with different 
preferentiality, from random ($\gamma = 0$) to maximum centralization ($\gamma > 1$), and with 
different fraction of long range links, from pure regular lattices with no long 
range links ($\phi = 0$) to networks with no local component ($\phi = 1$). For each 
collection of the parameters p, $\gamma$, and $\phi$, the network load, N, is calculated and 
averaged over a certain time window and over 100 realizations of the network, 
so that fluctuations due to particular simulations of the packet generation and 
of the network creation are minimized. As in the economics literature, the 
objective is to minimize the average time $\tau$ for a packet to go from the origin to 
the destination. 

According to Little's Law of queuing theory [37], the characteristic time is 
proportional to the average total load, N, of the network: 

$$\frac{N}{\tau} = pS \tag{11}$$

where p is the probability of packet generation for each node at each time step. 
Thus, minimizing the average cost of a search is equivalent to minimizing the 
total load N of the network. 
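
As a small worked example of this relation, the snippet below converts a measured load into an average delivery time. It is only a sketch: the function name and the numerical values are hypothetical and are not taken from the simulations.

```python
def mean_delivery_time(load, p, size):
    """Little's law: tau = N / (p * S), where p * S is the packet arrival rate.

    load -- average number of packets in the network, N
    p    -- probability of packet generation per node and time step
    size -- number of nodes, S
    """
    return load / (p * size)


if __name__ == "__main__":
    # Hypothetical numbers: S = 100 nodes generating packets with p = 0.01
    # inject on average one packet per time step, so a load of 12 packets
    # corresponds to an average delivery time of about 12 time steps.
    print(mean_delivery_time(12.0, 0.01, 100))
```

Since p and S are fixed in each comparison, ranking networks by N and ranking them by the average delivery time give the same order.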

Fig. 6. (a) and (b) Average number of packets flowing in the network as a function of 
the fraction of preferential links: (a) p = 0.01 and (b) p = 0.03. Symbol (+) corresponds 
to $\gamma = 0$ (random links) and symbol (x) corresponds to $\gamma = 6$ (extremely focused links). 
Figures (c), (d) and (e) show the typical shape of complex networks with particularly 
efficient configurations: (c) $\gamma = 0$ and $\phi = 0.12$; (d) $\gamma = 6$ and $\phi = 0.07$; and (e) $\gamma = 6$ 
and $\phi = 1.0$. 

The main results are shown in Fig. 6. Consider first the behavior of the 
networks at low values of p. Figure 6.a shows the load of the network for p = 0.01 
as a function of the fraction of long range links, $\phi$, both when they are random 
($\gamma = 0$) and when they are extremely preferential ($\gamma = 6$). In the last case, long 
range links are established only with the most connected node. In this case of 
small p, centralization is not a big problem because congestion effects are still not 
important. Therefore, preferential links are, in general, better than random long 
range links. In the case of preferential links, it is interesting to understand the 
behavior of the curve $N(\phi)$. For $\phi = 0$ the network is a two-dimensional regular 
lattice and then the average distance between nodes is large. As some long range 
links are introduced, the average path length decreases as in the Watts-Strogatz 
model [1], and therefore the load of the network is smaller because packets reach 
their destination faster. However, the addition of long range links implies the lack 
of local links and when $\phi$ is further increased, the heuristic of minimizing the 
lattice distance $\Delta$ becomes worse and worse. This explains why, for $\phi \approx 0.15$ 
(the network is similar to the one depicted in Fig. 6.d), the load has a local 
minimum that arises from the trade-off between the two effects of introducing 
long range preferential links: the shortening of distances, which tends to decrease 
N, and the destruction of the lattice structure, which tends to decrease the utility of 
the heuristic search and thus to increase N. If $\phi$ is further increased, one node 
tends to concentrate all the links and for $\phi = 1$ (Fig. 6.e) the network is strictly 
a star with one central node and the rest connected to it. In this completely 
centralized situation, the lack of a two-dimensional lattice is not important because 
the packets will be sent to the central node and from there directly to the 
destination. Since for small p congestion is not an issue, this structure turns out 
to be even better than the locally optimal structure with $\phi \approx 0.15$. 

The situation is different when considering higher values of the probability 
of packet generation (Fig. 6.b displays the results for p = 0.03). Regarding 
preferential linking, the two locally optimal structures with $\phi = 0.7$ and $\phi = 1$ 
(Figs. 6.d and 6.e respectively) persist. However, in this situation and due to 
congestion considerations the first is better than the second. Thus, at some 
intermediate value of 0.01 < p < 0.03, there is a transition such that the optimal 
structure changes from being the star configuration to being the mixed 
configuration with local as well as preferential links. Significantly, this transition is 
sharp, meaning that there is no continuous passage from the star to the mixed configuration. 

Beyond the behavior of networks built with preferential long range links, it is 
worth noting that when the effect of the congestion is important (Fig. 6.b), the 
structure depicted in Fig. 6.c, where the long range links are actually thrown at 
random, becomes better than the structure in Fig. 6.d. In other words, the optimal 
network is, in this case, a completely decentralized small world network à la 
Watts-Strogatz. 

7 Optimization in a general framework 

In the previous section we have compared the behavior of networks which have 
been built following different rules (nearest neighbor linking, preferential attach- 
ment, etc.). The main reason for focusing on a particular set of networks is that 
it is very costly to compare the performance of two networks: it is necessary to 
run a simulation, wait for the stationary state and calculate the average load of 
the network. In particular, close to the critical point the time needed to reach the 
stationary state diverges. In [22] we presented a formalism that is able to cope 
with search and congestion simultaneously, allowing the determination of opti- 
mal topologies. This formalism avoids the problem of simulating the dynamics 
of the communication process and provides a general scenario applicable to any 
communication process. 

Let us focus on a single information packet at node i whose destination is 
node k. The probability for the packet to go from i to a new node j in its next 
movement is $p^k_{ij}$. In particular, $p^k_{kj} = 0 \;\forall j$, so that the packet is removed as soon 
as it arrives at its destination. This formulation is completely general, and the 
precise form of $p^k_{ij}$ will depend on the search algorithm and on the connectivity 
matrix of the network. In particular, when the search is Markovian, $p^k_{ij}$ does not 
depend on previous positions of the packet. In this case, the probability of going 
from i to j in n steps is given by 

$$P^k_{ij}(n) = \sum_{l_1, l_2, \ldots, l_{n-1}} p^k_{i l_1} \, p^k_{l_1 l_2} \cdots p^k_{l_{n-1} j}. \tag{12}$$

This definition allows us to compute the average number of times, $b^k_{ij}$, that a 
packet generated at i and with destination at k passes through j: 

$$b^k = \sum_{n=1}^{\infty} P^k(n) = \sum_{n=1}^{\infty} (p^k)^n = (I - p^k)^{-1} p^k, \tag{13}$$

and the effective betweenness of node j, $B_j$, is then defined as the sum over all 
possible origins and destinations of the packets, 

$$B_j = \sum_{i,k} b^k_{ij}. \tag{14}$$

When the search algorithm is able to find the minimum paths between nodes, 
the effective betweenness will coincide with the topological betweenness, $\beta_j$, as 
usually defined [32,28]. 
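
To make the definitions in Eqs. (12)-(14) concrete, the following sketch evaluates them for a small star network. It is an illustration under stated assumptions rather than the procedure used in [22]: the function names are hypothetical, node 0 is taken as the central node, and the routing is assumed to be deterministic and loop-free (leaves forward to the centre, the centre forwards to the destination). Under that assumption the powers of the jump matrix vanish after at most S - 1 steps, so the truncated sum coincides with the closed form $(I - p^k)^{-1} p^k$.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][l] * b[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def jump_matrix_star(size, dest):
    """Jump matrix p^k for a star with centre 0 and destination k = dest."""
    p = [[0] * size for _ in range(size)]
    for i in range(size):
        if i == dest:
            continue  # p^k_{kj} = 0: the packet is removed on arrival
        # The centre delivers directly; a leaf forwards to the centre.
        nxt = dest if (i == 0 or dest == 0) else 0
        p[i][nxt] = 1
    return p


def visits_matrix(p):
    """b^k = sum over n >= 1 of (p^k)^n, truncated at n = S - 1.

    The truncation is exact only for loop-free routing, where (p^k)^n = 0
    for n >= S.
    """
    size = len(p)
    total = [row[:] for row in p]
    power = [row[:] for row in p]
    for _ in range(size - 2):
        power = mat_mul(power, p)
        total = [[t + q for t, q in zip(t_row, q_row)]
                 for t_row, q_row in zip(total, power)]
    return total


def effective_betweenness(size, jump_matrix):
    """B_j = sum over origins i and destinations k of b^k_{ij}."""
    betweenness = [0] * size
    for dest in range(size):
        b = visits_matrix(jump_matrix(size, dest))
        for origin in range(size):
            for j in range(size):
                betweenness[j] += b[origin][j]
    return betweenness


if __name__ == "__main__":
    print(effective_betweenness(4, jump_matrix_star))  # [9, 3, 3, 3]
```

For S = 4 the central node collects 9 passages and each leaf 3. Their sum, 18, equals the sum of the path lengths over all ordered origin-destination pairs, since with the sum starting at n = 1 every hop of a packet, including its arrival at the destination, is counted once.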

Once these quantities have been defined, we focus on the load of the network, 
N(t), which is the number of floating packets. These floating packets are stored 
in the nodes that act as queues. In a general scenario where packets are generated 
at random and independently at each node with a probability p, the arrival of 
packets to a given node j is a Poisson process. In the original model presented in 
Sect. 4 we assumed that the quality of the channels depends on both the sender 
and the receiver nodes; if one assumes that it only depends on the receiver node 
then the delivery of packets is also a Poisson process. In this simple picture, the 
queues are called M/M/1 in the computer science literature and the average load 
of the network is [37,22] 

$$N = \sum_{j=1}^{S} \frac{p B_j/(S-1)}{1 - p B_j/(S-1)}. \tag{15}$$

There are two interesting limiting cases of equation (15). When p is very small, 
taking into account that the sum of betweennesses is proportional to the average 
distance, one obtains that the load is proportional to the average effective 
distance. On the other hand, when p approaches $p_c$ most of the load of the network 
comes from the most congested node, and therefore 

$$N \approx \frac{1}{1 - p B^*/(S-1)}, \qquad p \to p_c, \tag{16}$$

where B* is the effective betweenness of the most central node. The last results 
suggest the following interesting problem: to minimize the load of a network it 
is necessary to minimize the effective distance between nodes if the amount of 
packets is small, but it is necessary to minimize the largest effective betweenness 
of the network if the amount of packets is large. The first is accomplished by 
a star-like network, that is, a network with one central node and all the others 
connected to it. The second, however, is accomplished by a very decentralized 
network in which all the nodes support a similar load. This behavior is similar 
to any system of queues provided that the communication depends only on the 
sender. 
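
The two limits can be illustrated numerically by evaluating Eq. (15) for a star and for a ring. The sketch below rests on several assumptions that are not part of the original text: the function names are hypothetical, packets are assumed to follow shortest paths, the closed-form betweenness values are derived here for those two topologies only (counting arrivals at the destination, as in Eq. (13)), and the ring is used merely as a simple example of a homogeneous network with an odd number of nodes. The critical probability is taken as the value at which Eq. (16) diverges.

```python
def network_load(betweenness, p):
    """Average load from Eq. (15); infinite once any queue saturates."""
    size = len(betweenness)
    load = 0.0
    for b in betweenness:
        x = p * b / (size - 1)  # utilisation of the queue at this node
        if x >= 1.0:
            return float("inf")
        load += x / (1.0 - x)
    return load


def critical_p(betweenness):
    """Value of p at which the most central node saturates (Eq. (16) diverges)."""
    return (len(betweenness) - 1) / max(betweenness)


def star_betweenness(size):
    """Centre: S - 1 arrivals plus (S - 1)(S - 2) transits; leaves: S - 1 arrivals."""
    return [(size - 1) ** 2] + [size - 1] * (size - 1)


def ring_betweenness(size):
    """Ring with an odd number of nodes: every node carries m(m + 1), m = (S - 1)/2."""
    m = (size - 1) // 2
    return [m * (m + 1)] * size


if __name__ == "__main__":
    star, ring = star_betweenness(11), ring_betweenness(11)
    for p in (0.01, 0.09):
        print(p, network_load(star, p), network_load(ring, p))
    print(critical_p(star), critical_p(ring))
```

With S = 11, the star carries the smaller load at p = 0.01, where short distances dominate, while the ring carries the smaller load at p = 0.09, close to the point where the central node of the star saturates.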

It is worth noting that there are only two assumptions in the calculations 
above. The first one has already been mentioned: the movement of the packets 
needs to be Markovian to define the jump probability matrices $p^k$. Although 
this is not strictly true in real communication networks — where packets are not 
usually allowed to go through a given node more than once — it can be seen 
as a first approximation [14,16,17]. The second assumption is that the jump 
probabilities do not depend on the congestion state of the network, although 
communication protocols sometimes try to avoid congested regions, and then 
Bj = Bj(p). However, all the derivations above will still be true in a number 
of general situations, including situations in which the paths that the packets 
follow are unique, in which the routing tables are fixed, or situations in which 
the structure of the network is very homogeneous and thus the congestion of all 
the nodes is similar. Compared to situations in which packets avoid congested 
regions, they correspond to the worst case and thus provide bounds for more 
realistic scenarios in which the search algorithm interactively avoids congestion. 

Equation (15) relates a dynamical variable, the load, with the topological 
properties of the network and the properties of the algorithm. So we have con- 
verted a dynamical communication problem into a topological problem. Hence, 
the dynamical optimization procedure of finding the structure that gives the 
minimum load is reduced to a topological optimization procedure where the 
network is characterized completely by its effective betweenness distribution. In 
[22] we considered the problem of finding optimal structures for a purely local 
search, using a generalized simulated annealing (GSA) procedure, as described 
in [38,39]. On the one hand, we have found that for $p \to 0$ the optimal network 
has a star-like centralized structure as expected, which corresponds to the 
minimization of the average effective distance between nodes. On the other ex- 
treme, for high values of p, the optimal structure has to minimize the maximum 
betweenness of the network; this is accomplished by creating a homogeneous 
network where all the nodes have essentially the same degree, betweenness, etc. 
One could expect that the transition centralized-decentralized occurs progres- 
sively. Surprisingly, the results of the optimization process reveal a completely 
different scenario. According to simulations, star-like configurations are optimal 
for p < p*; at this point, the homogeneous networks that minimize B* become 
optimal. Therefore there are only two types of structures that can be optimal for 
a local search process: star-like networks for p < p* and homogeneous networks 
for p > p* . 
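
The idea of treating the load of Eq. (15) as a cost function over topologies can be sketched as follows. This is not the generalized simulated annealing of [38,39] nor the local search studied in [22]: as a simplification, the sketch uses plain Metropolis simulated annealing with single-link rewiring and assumes shortest-path routing. All names and parameter values (`anneal`, `steps`, `t0`, `cooling`, `seed`) are hypothetical.

```python
import math
import random
from collections import deque


def betweenness(adj):
    """Shortest-path visit counts per node (arrivals included); None if disconnected."""
    size = len(adj)
    counts = [0] * size
    for dest in range(size):
        # Breadth-first search from the destination fixes one next hop per node.
        next_hop = {dest: None}
        queue = deque([dest])
        while queue:
            u = queue.popleft()
            for v in sorted(adj[u]):
                if v not in next_hop:
                    next_hop[v] = u
                    queue.append(v)
        if len(next_hop) < size:
            return None
        for origin in range(size):
            node = origin
            while node != dest:
                node = next_hop[node]
                counts[node] += 1
    return counts


def load(adj, p):
    """Eq. (15); infinite if the network is disconnected or a queue saturates."""
    counts = betweenness(adj)
    if counts is None:
        return math.inf
    total = 0.0
    for b in counts:
        x = p * b / (len(adj) - 1)
        if x >= 1.0:
            return math.inf
        total += x / (1.0 - x)
    return total


def anneal(adj, p, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Rewire one link at a time, keeping the number of links fixed."""
    rng = random.Random(seed)
    adj = [set(nbrs) for nbrs in adj]
    size = len(adj)
    cost = load(adj, p)
    best, best_cost = [set(s) for s in adj], cost
    temperature = t0
    for _ in range(steps):
        edges = [(u, v) for u in range(size) for v in sorted(adj[u]) if u < v]
        u, v = rng.choice(edges)
        a, b = rng.sample(range(size), 2)
        if b in adj[a]:
            continue  # the proposed link already exists
        adj[u].remove(v); adj[v].remove(u)
        adj[a].add(b); adj[b].add(a)
        new_cost = load(adj, p)
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temperature):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = [set(s) for s in adj], cost
        else:
            adj[a].remove(b); adj[b].remove(a)
            adj[u].add(v); adj[v].add(u)
        temperature *= cooling
    return best, best_cost


if __name__ == "__main__":
    ring = [{(i - 1) % 9, (i + 1) % 9} for i in range(9)]
    print(load(ring, 0.01), anneal(ring, 0.01, steps=300)[1])
```

Because each evaluation of the cost only requires the betweenness of the candidate network, no simulation of the packet dynamics is needed, which is the practical advantage of the formalism described above.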

Fig. 7. Optimal topologies for networks with S = 32 nodes, L = 32 links and global 
knowledge: (a) p = 0.010, (b) p = 0.020, (c) p = 0.050, (d) p = 0.080. In this case of 
global knowledge, the transition from centralization to decentralization seems smooth. 

Beyond the existence of both centralized and decentralized optimal networks, 
it is significant that the transition from one sort of networks to the other is 
abrupt, meaning that there are no intermediate optimal structures between total 
centralization and total decentralization. As already mentioned, this property 
is shared by the model networks in the previous section. Our explanation of this 
fact is the following. Since we are considering (in both the present and the last 
sections) local knowledge of the network topology, centered star-like configura- 
tions are extremely efficient in searching destinations and thus minimizing the 
effective distance between nodes. This explains that stars are optimal for a wide 
range of values of p, until the central node (or nodes) becomes congested. At 
this point, structures similar to stars will have the same problem and will be 
much worse regarding search; at this point, the only alternative is something 
completely decentralized, where the absence of congestion can compensate for the 
dramatic increase in the effective distance between nodes. If this explanation is 
correct, one should be able to obtain a smooth transition from centralization to 
decentralization by considering global knowledge of the network, in such a way 
that the average effective distance (that in this case coincides with the average 
path length) is not much larger in an arbitrary network than in the star. Al- 
though we do not have extensive simulations in this case, Fig. 7 shows that there 
is some evidence to think that this is indeed the case. 

8 Summary 

We have presented some results concerning search and congestion in networks. 
By defining a communication model we have been able to cope with the problems 
of search and congestion simultaneously. For a hierarchical lattice some analytical 
results are found, by exploiting the symmetry properties of the network. For 
complex networks, this is not the case, and computational optimization to look 
for the best structures is required. On the one hand, for model networks where 
short-range, long-range, random and preferential connections are mixed we find 
that networks that perform well for very low load become easily congested when 
the load is increased. On the other hand, when searching for optimal structures in 
a general scenario there is a clear transition from star-like centralized structures 
to homogeneous decentralized ones. 